Adaptation of front end parameters in a speech recognizer

Authors

  • Karthik Visweswariah
  • Ramesh A. Gopinath
Abstract

In this paper we consider the problem of adapting the parameters of the algorithm used to extract features. Typical speech recognition systems use a sequence of modules to extract features, which are then used for recognition. We present a method to adapt the parameters in these modules under a variety of criteria, e.g., maximum likelihood or maximum mutual information (MMI). The method assumes that the functions the modules implement are differentiable with respect to their inputs and parameters. We use this framework to optimize a linear transform preceding the linear discriminant analysis (LDA) matrix and show that, with small amounts of data, it gives significantly better performance than a linear transform after the LDA matrix. We show that linear transforms can be estimated by directly optimizing the likelihood or the MMI objective without using auxiliary functions. We also apply the method to optimize the Mel bins and the compression power in a system that uses power-law compression.
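The chain-rule idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy front end with two modules (power-law compression followed by a fixed linear projection), a single fixed Gaussian standing in for the acoustic model, and plain gradient ascent on the average log-likelihood. All function names, data values, and step sizes are hypothetical, and any Jacobian or normalization terms a real objective might need are omitted.

```python
import math

def front_end(energies, power, weights):
    """Toy front end: power-law compression, then a fixed linear projection."""
    compressed = [e ** power for e in energies]             # module 1: e -> e^p
    return sum(w * c for w, c in zip(weights, compressed))  # module 2: scalar feature

def avg_log_likelihood(frames, power, weights, mean, var):
    """Average log-likelihood of the features under one fixed Gaussian (assumption)."""
    total = 0.0
    for energies in frames:
        y = front_end(energies, power, weights)
        total += -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)
    return total / len(frames)

def grad_wrt_power(frames, power, weights, mean, var):
    """Chain rule: dL/dp = dL/dy * dy/dp, using d(e^p)/dp = e^p * ln(e)."""
    total = 0.0
    for energies in frames:
        y = front_end(energies, power, weights)
        dl_dy = -(y - mean) / var
        dy_dp = sum(w * (e ** power) * math.log(e)
                    for w, e in zip(weights, energies))
        total += dl_dy * dy_dp
    return total / len(frames)

def adapt_power(frames, power, weights, mean, var, step=0.01, iters=100):
    """Directly ascend the likelihood in the compression power (no auxiliary function)."""
    for _ in range(iters):
        power += step * grad_wrt_power(frames, power, weights, mean, var)
    return power

# Hypothetical toy data: three frames of two Mel-bin energies each.
FRAMES = [[2.0, 4.0], [3.0, 5.0], [1.5, 6.0]]
WEIGHTS = [0.5, 0.5]
MEAN, VAR = 1.5, 0.1

initial_power = 0.5
adapted_power = adapt_power(FRAMES, initial_power, WEIGHTS, MEAN, VAR)
print("initial log-likelihood:",
      avg_log_likelihood(FRAMES, initial_power, WEIGHTS, MEAN, VAR))
print("adapted power:", adapted_power)
print("adapted log-likelihood:",
      avg_log_likelihood(FRAMES, adapted_power, WEIGHTS, MEAN, VAR))
```

The same pattern would extend to other front-end parameters, such as a linear transform before the projection, provided each module is differentiable in its inputs and parameters. For an MMI-style criterion only the derivative of the objective with respect to the features would change in this sketch.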

Related papers

Stochastic gradient adaptation of front-end parameters

This paper examines how any parameter in the typical front end of a speech recognizer can be rapidly and inexpensively adapted with usage. It focuses firstly on demonstrating that effective adaptation can be accomplished using low CPU/memory cost stochastic gradient descent methods, secondly showing that adaptation can be done at time scales small enough to make it effective with just a singl...

Optimization of Speech Enhancement Front-End with Speech Recognition-Level Criterion

This paper concerns the use of speech enhancement to improve automatic speech recognition (ASR) performance in noisy environments. Speech enhancement systems are usually designed separately from a back-end recognizer by optimizing the front-end parameters with signal-level criteria. Such a disjoint processing approach is not always useful for ASR. Indeed, time-frequency masking, which is widely u...

A bitstream-based front-end for wireless speech recognition on IS-136 communications system

In this paper, we propose a feature extraction method for a speech recognizer that operates in digital communication networks. The feature parameters are basically extracted by converting the quantized spectral information of a speech coder into a cepstrum. We also include the voiced/unvoiced information obtained from the bitstream of the speech coder in the recognition feature set. We performe...

An on-line acoustic compensation technique for robust speech recognition

In this work we report on the use of an on-line acoustic compensation technique for robust speech recognition. With this technique, acoustic mismatch between training and actual conditions is reduced through acoustic mapping. At the recognition stage, observation vectors delivered by the acoustic front-end are mapped into a reference acoustic space, while input data are exploited to update the stati...

A binaural microscopic model based on a modulation filter bank for predicting speech intelligibility in people with normal hearing

In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, the spectral criteria such as the STI and SII or other analytical methods have been used in the binaural models to determine the binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...

Zero-Crossings with Adaptation for Automatic Speech Recognition

An auditory model based on zero-crossings with peak amplitudes (ZCPA) was used as a front-end for automatic speech recognition (ASR) with the perceptual property of adaptation as determined by psychoacoustic observations. The model performance was evaluated on the isolated digits (TIDIGITS) database using a continuous-density HMM recognizer in additive noise. Experimental results indicate that th...

Publication date: 2004